Skip to content

Add RTP/v3 interface design#362

Open
JoTurk wants to merge 1 commit into
mainfrom
jo/rtp-v3
Open

Add RTP/v3 interface design#362
JoTurk wants to merge 1 commit into
mainfrom
jo/rtp-v3

Conversation

@JoTurk
Copy link
Copy Markdown
Member

@JoTurk JoTurk commented May 25, 2026

Description

Note: that this is V3 because rtp/v2 was released by mistake a few years ago, This moves the library more towards a generic RTP library, this is backward compatible with v1 for users with custom packetizers/depacketizers, but we should drop this in v4.

Today the payloader interface is:

type Payloader interface {
    Payload(mtu uint16, payload []byte) [][]byte
}

and the generic webrtc-centeric packetizer builds RTP packets afterward. In the current interface, the market bit is hardcoded as "true on the last emitted payload fragment" which was true for many webrtc rtp payloads.

However, RTP defines the market bit as profile-specific not last fragment, RTP-MIDI is a clean example of the limitation of this interface #361 This RFC says the marker bit depends on the media type, for mpeg-genric for example it's always one, and this can't be expressed by "last payload wins".

When it comes to the depacketizer, the API has related problems:

type Depacketizer interface {
    Unmarshal(packet []byte) ([]byte, error)
    IsPartitionHead(payload []byte) bool
    IsPartitionTail(marker bool, payload []byte) bool
}

Unmarshal only receives the RTP payload, IsPartitionHead receives only payload bytes, and only IsPartitionTail gets the marker bit. The full RTP header, timestamp, sequence number, extensions, padding state, SSRC, and payload type are not available to the depacketizer methods.

Then, The samplebuilder then calls the depacketizer in a way that reinforces this limitation: it checks the tail with Marker and Payload, checks the head with only Payload, and appends the result of Unmarshal(payload) to the sample data.

This broke down for me when I was finishing av1 support because RTP can omit the length field, and had to make the interface stateful to deal with stuff like W !=0, and it wasn't not possible to honor the push / append model.

Many depacketizers, carry mutable depacketization state such as fuaBuffer in H264Packet, so a packet type is also acting as a stateful depacketizer.

this refactors the interface into

codec bitstream reader/writer
        ↓
RTP payload-format packetizer/depacketizer
        ↓
generic RTP packet sequencer/parser
        ↓
sample builder / jitter buffer / transport integration

which let's us define interfaces for codecs that have different input payloads e.g accepting avc and annex-b h264 without having hacks or random options, Also being able to write avc or annex-b as output without multiple transformation.

Reference issue

Fixes #69 and fixes #70

@JoTurk
Copy link
Copy Markdown
Member Author

JoTurk commented May 25, 2026

(I submitted without writing the description, writing it with edit ig.)

@codecov
Copy link
Copy Markdown

codecov Bot commented May 25, 2026

Codecov Report

❌ Patch coverage is 76.92308% with 9 lines in your changes missing coverage. Please review.
✅ Project coverage is 82.58%. Comparing base (77ebb16) to head (f484a14).

Files with missing lines Patch % Lines
payload_format.go 76.92% 6 Missing and 3 partials ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main     #362      +/-   ##
==========================================
- Coverage   82.64%   82.58%   -0.07%     
==========================================
  Files          28       29       +1     
  Lines        3440     3479      +39     
==========================================
+ Hits         2843     2873      +30     
- Misses        427      433       +6     
- Partials      170      173       +3     
Flag Coverage Δ
go 82.58% <76.92%> (-0.07%) ⬇️

Flags with carried forward coverage won't be shown. Click here to find out more.

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

@JoTurk JoTurk requested review from Sean-Der, jech and jwetzell May 25, 2026 05:29
@JoTurk
Copy link
Copy Markdown
Member Author

JoTurk commented May 25, 2026

@jech you didn't like the current RTP interface, what do you think about this one?

@JoTurk JoTurk requested a review from at-wat May 25, 2026 05:44
@sirzooro
Copy link
Copy Markdown
Contributor

sirzooro commented May 25, 2026

There are many things marked as deprecated in pion/rtp, it would be good to remove them during migration to v3.

@JoTurk
Copy link
Copy Markdown
Member Author

JoTurk commented May 25, 2026

@sirzooro thank you for the reminder, and yes this won't be the last PR, we should also move some stuff around.

@JoTurk JoTurk requested a review from sirzooro May 25, 2026 12:35
Comment thread payload_format.go

package rtp

// MediaFormat describes the bitstream or sample representation passed to a payload format.
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not sure how much is applicable here, but I think whenever possible the RFC should be referenced for naming and rational behind particular concepts. Even if its just a comment with the RFC # + section.

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is mostly application logic but I'll try to reference all my reasons, and specs behind every interface.

Comment thread payload_format.go
@jech
Copy link
Copy Markdown
Member

jech commented May 25, 2026

I need to think it over, but at first sight this looks good. Two questions.

Why do we need the PacketizerContext as an explicit parameter? The Packetizer is a stateful object, right, so why not make the context part of the Packetizer?

Why do we need the Flush method?

@JoTurk
Copy link
Copy Markdown
Member Author

JoTurk commented May 25, 2026

@jech yeah I agree it was the first iteration and I just added too much, going to remove context from the public input, the user only needs to configure de/packetizers once (e.g avc vs annexb for h264).
Also agree about flush it's not needed since the new interface owns all the logic.
Going to remove them both.

@jech
Copy link
Copy Markdown
Member

jech commented May 27, 2026 via email

@fippo
Copy link
Copy Markdown

fippo commented May 27, 2026

A keyframe is a frame with no dependencies to prior frames. https://medooze.medium.com/mastering-the-av1-svc-chains-a4b2a6a23925 is still a very good reference. RTP shouldn't know about this ideally.

@JoTurk
Copy link
Copy Markdown
Member Author

JoTurk commented May 27, 2026

@fippo In av1, RTP needs to know if it's a keyframe to set the N bit correctly, this is a limitation in Pion/RTP AV1 impl is that we set the N bit for all seq-headers and not only for key frames, because I didn't want to parse again in the packetizer

rtp/codecs/av1_packet_test.go

Lines 1447 to 1490 in 1edc72d

{
Name: "Sequence header should start a new packet",
MTU: 1000,
InputPayload: (testAV1MultiOBUsPayload{
{
Header: &obu.Header{
Type: obu.OBUFrameHeader,
HasSizeField: true,
},
Payload: []byte{0x01, 0x02, 0x03, 0x04, 0x05},
},
{
Header: &obu.Header{
Type: obu.OBUSequenceHeader,
},
Payload: []byte{0x06, 0x07, 0x08, 0x09, 0x0A},
},
}).Marshal(),
OutputPayloads: [][]byte{
append(
(testAV1AggregationHeader{
W: 1,
}).Marshal(),
(testAV1OBUPayload{
Header: &obu.Header{
Type: obu.OBUFrameHeader,
},
Payload: []byte{0x01, 0x02, 0x03, 0x04, 0x05},
}).Marshal()...,
),
append(
(testAV1AggregationHeader{
W: 1,
N: true,
}).Marshal(),
(testAV1OBUPayload{
Header: &obu.Header{
Type: obu.OBUSequenceHeader,
},
Payload: []byte{0x06, 0x07, 0x08, 0x09, 0x0A},
}).Marshal()...,
),
},
},
Libwebrtc sets it correctly to only keyframes https://source.chromium.org/chromium/chromium/src/+/main:third_party/webrtc/modules/rtp_rtcp/source/rtp_packetizer_av1.cc;l=377?q=rtp_packetizer_av1.cc&ss=chromium%2Fchromium%2Fsrc

While it seems like setting the N flag for all seq headers doesn't break any impls, e.g chrome doesn't even construct full / real frames, because decoders don't require them. https://chromium.googlesource.com/external/webrtc/+/HEAD/modules/rtp_rtcp/source/video_rtp_depacketizer_av1.cc#387

But this change will make us correct Pion AV1 and other codecs impls that needs to be aware about "key frame".

A similar limitation is that we don't set the P flag at all in our hevc impl, and we should set it at the end of video seq, This redesign should make all of this possible without parsing again inside RTP (and pion/webrtc and in the user application sometimes e.g with the current design the user has to do avc to annex-b before passing it to pion!).

@jech
Copy link
Copy Markdown
Member

jech commented May 27, 2026

A keyframe is a frame with no dependencies to prior frames. https://medooze.medium.com/mastering-the-av1-svc-chains-a4b2a6a23925 is still a very good reference. RTP shouldn't know about this ideally.

In addition to what Joe said, applications need to be able to identify a keyframe. For example, when a client joins a conference, it must not try to decode the stream until it sees a keyframe.

@jwetzell
Copy link
Copy Markdown

@JoTurk I think I'm piecing this together so the PacketInfo here would kind of in the realm of this in the chromium example?

Comment thread payload_format.go
}

// MediaSample is the media input passed to a payload-format packetizer.
type MediaSample struct {
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There is no timestamp field here. It would would be needed to implement extensions like ones from RFC 6051, or deal with gaps in media stream.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Add a way to set codec specific payload descriptor flags to Payloader Always set Marker bit to false?

5 participants